Abstract: Currently, for the $n_T \times n_R$ MIMO channel, any
explicitly constructed space-time (ST) designs that achieve
optimality with respect to the diversity multiplexing tradeoff (DMT)
are known to do so only when decoded using maximum likelihood (ML)
decoding, which may incur prohibitive decoding complexity. In this
paper we prove that MMSE regularized lattice decoding, as well as
the computationally efficient lattice reduction (LR) aided MMSE
decoder, allows for efficient and DMT optimal decoding of any
approximately universal lattice-based code. The result identifies
for the first time an explicitly constructed encoder and a
computationally efficient decoder that achieve DMT optimality for
all multiplexing gains and all channel dimensions. The results hold
irrespective of the fading statistics.

I. Introduction 

The emergence of MIMO-related scenarios such as MIMO-OFDM and
cooperative diversity has created the need for multi-dimensional
encoding schemes that can be efficiently decoded and that guarantee
good error probability performance under a plethora of channel
topologies and statistics. Towards addressing this need, a
substantial amount of research has sought to improve and analyze
the error probability performance and decoding complexity of
different MIMO encoding/decoding schemes. Such work often focuses
on applying specific error probability performance measures to
analyze the behavior of specific transmission schemes as they are
decoded by different optimal and suboptimal decoders.

A. Related work 

From the encoding point of view, substantial research has 
aimed towards providing space-time (ST) codes with structure 
that allows for good error probability performance and efficient 
decoding. Such work includes [1], which provides codes based
on Clifford algebras that can be seen as generalizations of
orthogonal designs and which have good maximum likelihood
(ML) decoding complexity. Furthermore, the work in [2]
describes codes that take advantage of channel asymmetry
($n_R < n_T$) to achieve good error probability performance with
reduced decoding complexity.

From the point of view of detection and decoding, ML- 
based decoders are known to provide optimal performance 
but often do so only with prohibitive computational complexity.
Different computationally efficient sub-optimal receiver
architectures were introduced, with work focusing on linear
receivers (MMSE and ZF) and their decision feedback equalization
(DFE) variants [3], as well as lattice reduction (LR)
aided versions of these [4], [5]. Substantial work further looked
to analyze the performance of such receivers. For example, the 
work in [6] showed that LR-aided ZF decoding can achieve 
maximal receive diversity for uncoded V-BLAST. 

With the recent emergence of the diversity multiplexing 
tradeoff (DMT) [7] describing, in a unified manner, the funda- 
mental performance limits of outage-limited MIMO communi- 
cations, research has focused on establishing the DMT perfor- 
mance of different encoder/decoder architectures. The work 
in [8] proved that the naive lattice decoder fails to achieve 
the diversity multiplexing tradeoff in general. Furthermore, in 
[9], DMT analysis reveals that both ZF and MMSE linear 
receivers are suboptimal in terms of their achievable diversity. 
An important step towards establishing that DMT optimality 
can be achieved with computationally efficient encoders and 
decoders was presented in [10]. By using an ensemble of 
lattice codes, an MMSE pre-processing step, and an optimal 
lattice translate, it was shown that there exist lattice codes
that, when decoded using lattice decoding, achieve optimal 
DMT performance over the i.i.d. Rayleigh fading channel. 

B. Contributions of present work 

In this work, we extend the results in [10] by bypassing 
random ensemble arguments to show that DMT optimality is 
achievable for all multiplexing gains, by employing explicitly 
constructed encoders and computationally efficient decoders. 
Specifically, we consider explicitly constructed approximately 
universal codes [11]-[13], and regularized lattice decoding. It
is shown that DMT optimality holds for all fading statistics. 
The key to DMT optimality, as will be shown later, is 
the MMSE regularization of the decoding metric. We also 
establish the DMT optimality of the computationally efficient 
LLL based LR-aided MMSE decoder [5]. 

II. System model and space-time coding 
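We consider the quasi-static $n_T \times n_R$ MIMO channel model

$$Y = HX + W \tag{1}$$

where $Y \in \mathbb{C}^{n_R \times T}$, $H \in \mathbb{C}^{n_R \times n_T}$,
$X \in \mathbb{C}^{n_T \times T}$ for $T \geq n_T$,
$W \in \mathbb{C}^{n_R \times T}$, and
$\mathrm{vec}(W) \sim \mathcal{N}_{\mathbb{C}}(0, I)$. Here, we use
$\mathrm{vec}(W)$ to denote the column-by-column vectorization of
$W$, and $\mathcal{N}_{\mathbb{C}}(0, I)$ to denote a rotationally
invariant circularly symmetric complex normal random vector with
unit variance. The code matrices $X$ are drawn from a space-time
block code $\mathcal{X}$, satisfying the power constraint

$$\frac{1}{T} \|X\|_F^2 \leq \rho \quad \forall X \in \mathcal{X}. \tag{2}$$
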



A. The diversity multiplexing tradeoff 

The rate of an ST code $\mathcal{X}$ is given by
$R = T^{-1} \log |\mathcal{X}|$, and a sequence of codes or scheme,
indexed by $\rho$, is said to have a multiplexing gain of $r$ if
(c.f. [7])

$$r = \lim_{\rho \to \infty} \frac{R}{\log \rho}.$$

When $\hat{X}$ is the output of the decoder (not necessarily ML)
given $Y$ and $H$, the diversity gain $d$ of the scheme is

$$d = -\lim_{\rho \to \infty} \frac{\log P(\hat{X} \neq X)}{\log \rho}.$$



The central result of [7] is that for a fixed multiplexing gain
$r$, there is a fundamental limit to the diversity gain:

$$d \leq d_{\mathrm{out}}(r) = -\lim_{\rho \to \infty} \frac{\log P\left(\log\det(I + \rho H H^\dagger) < R\right)}{\log \rho} \tag{3}$$

where $R = r \log \rho$ and where $H^\dagger$ denotes the Hermitian
transpose of $H$. In the case of i.i.d. Rayleigh fading,
$d_{\mathrm{out}}(r)$ is given by the piecewise linear curve
connecting the points $(k, (n_R - k)(n_T - k))$ for
$k = 0, 1, \ldots, \min(n_R, n_T)$ [7]. A scheme which satisfies
(3) with equality for some $r$ is said to be DMT optimal for
this multiplexing gain.
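
As a small illustration of the i.i.d. Rayleigh case described above, the following sketch evaluates the piecewise linear curve through the corner points $(k, (n_R - k)(n_T - k))$. The function name `rayleigh_dmt` and its argument names are chosen here for illustration only and do not appear in [7].

```python
def rayleigh_dmt(r, n_t, n_r):
    """Outage exponent d_out(r) for i.i.d. Rayleigh fading.

    Linearly interpolates between the corner points
    (k, (n_t - k) * (n_r - k)) for k = 0, 1, ..., min(n_t, n_r).
    """
    n = min(n_t, n_r)
    if not 0 <= r <= n:
        raise ValueError("multiplexing gain must lie in [0, min(n_t, n_r)]")
    k = min(int(r), n - 1)  # corner point to the left of r (k = n - 1 when r == n)
    d_left = (n_t - k) * (n_r - k)
    d_right = (n_t - k - 1) * (n_r - k - 1)
    return d_left + (r - k) * (d_right - d_left)


if __name__ == "__main__":
    # Tabulate the 2 x 2 tradeoff curve at a few multiplexing gains.
    for r in (0, 0.5, 1, 1.5, 2):
        print(r, rayleigh_dmt(r, n_t=2, n_r=2))
```

For the $2 \times 2$ channel the corner points are $(0, 4)$, $(1, 1)$ and $(2, 0)$, so the curve drops quickly from the maximal diversity 4 as the multiplexing gain grows.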

We will in the following make use of the $\doteq$ notation (c.f.
[7]) for exponential equalities, where $f(\rho) \doteq \rho^b$ is
taken to mean $\lim_{\rho \to \infty} \log f(\rho) / \log \rho = b$.
The symbols $\dot{\leq}$ and $\dot{\geq}$ are defined similarly. Let
$\mathcal{O}_\epsilon$ denote the $\epsilon$-no-outage set given by

$$\mathcal{O}_\epsilon = \{ H \mid \log\det(I + \rho H H^\dagger) > (r + \epsilon) \log \rho \}. \tag{4}$$

As noted in [14] (see also [7]), a sufficient condition for DMT
optimality, regardless of the fading statistics, is that

$$P(\hat{X} \neq X \mid H \in \mathcal{O}_\epsilon) \doteq \rho^{-\infty} \tag{5}$$

for all $\epsilon > 0$, i.e. that the conditional probability of
decoding error vanishes exponentially fast for channels in
$\mathcal{O}_\epsilon$.

B. Approximately universal lattice space-time codes 

In this paper, we consider a sequence of lattice ST codes:

$$\mathcal{X} = \{ X = \mathrm{Mat}(\theta G s) \mid s \in \mathcal{S}_r \} \tag{6}$$

where $\theta \in \mathbb{R}_+$, $G \in \mathbb{C}^{n_T T \times \kappa}$
for some $\kappa \in \mathbb{N}$, and where

$$\mathcal{S}_r \triangleq \{ s \in \mathbb{Z}_G^\kappa \mid \|s\|^2 \leq \rho^{rT/\kappa} \}, \tag{7}$$

where $\mathbb{Z}_G = \mathbb{Z} + i\mathbb{Z}$ denotes the set of
Gaussian integers$^1$ and $\mathrm{Mat}(x)$ denotes the
$n_T \times T$ matrix formed via column-by-column stacking of
consecutive $n_T$-tuples of $x \in \mathbb{C}^{n_T T}$. Each
codeword is thus associated, via the lattice generator matrix
$G$, to a unique data vector $s \in \mathcal{S}_r \subset \mathbb{Z}_G^\kappa$.
The choice of $\mathcal{S}_r$ in (7) ensures a multiplexing gain
$r$, and choosing $\theta$ in order to satisfy the power constraint
(c.f. (2)) with equality implies that
$\theta^2 \doteq \rho^{1 - rT/\kappa}$. We assume throughout that the
lattice generator matrix $G$ is independent of $\rho$ and $r$.

1 Extensions of our main results to other constellations, such as the HEX
constellations, are straightforward and will appear in a journal version of
this work. They are omitted here due to lack of space.
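
The constellation in (7) and the $\mathrm{Mat}(\cdot)$ operation can be made concrete with a short sketch. The helper names `constellation` and `mat` are hypothetical, and the brute-force enumeration is only practical for toy parameters; it is meant to show the definitions, not to suggest an encoder implementation.

```python
from itertools import product


def constellation(rho, r, T, kappa):
    """All s in Z_G^kappa with ||s||^2 <= rho^(r*T/kappa), cf. (7)."""
    radius_sq = rho ** (r * T / kappa)
    reach = int(radius_sq ** 0.5) + 1  # safe bound on |Re| and |Im| of any entry
    axis = [complex(a, b)
            for a in range(-reach, reach + 1)
            for b in range(-reach, reach + 1)]
    return [s for s in product(axis, repeat=kappa)
            if sum(x.real ** 2 + x.imag ** 2 for x in s) <= radius_sq + 1e-9]


def mat(x, n_t, T):
    """Mat(x): consecutive n_t-tuples of x become the columns of an n_t x T matrix."""
    if len(x) != n_t * T:
        raise ValueError("x must have n_t * T entries")
    return [[x[col * n_t + row] for col in range(T)] for row in range(n_t)]


if __name__ == "__main__":
    print(len(constellation(rho=4, r=1, T=1, kappa=1)))
    print(mat([1, 2, 3, 4], n_t=2, T=2))
```

The number of points in $\mathcal{S}_r$ grows like $\rho^{rT}$, which is what gives the code a rate of roughly $r \log \rho$.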

A key feature of lattice ST codes is that they may be 
decoded by a class of decoders known as lattice decoders [10]. 
To this end we note that the input-output relation from s to 
y = vec(Y) is 

y = Fs + w (8) 
where w = vec(W), and where the effective channel matrix is 



$$F = \theta (I_T \otimes H) G. \tag{9}$$

The ML decoder is thus equivalent to (c.f. [10])

$$\hat{s}_{\mathrm{ML}} = \arg\min_{\hat{s} \in \mathcal{S}_r} \|y - F\hat{s}\|^2 \tag{10}$$

and may be approximated by a lattice decoder, whereby the
constellation boundary imposed by $\mathcal{S}_r$ is ignored [10].
The decoding of the lattice ST codes will be discussed in greater
detail in Sections III and IV.

Let $\mu_i(A)$ denote the $i$th eigenvalue of a Hermitian matrix
$A \in \mathbb{C}^{q \times q}$, ordered such that
$\mu_1(A) \leq \ldots \leq \mu_q(A)$. Further, let
$n = \min(n_T, n_R)$. A sequence of ST codes (not necessarily
lattice codes) is approximately universal [14] over the
$n_T \times n_R$ channel if and only if (c.f. [11])

$$\prod_{i=1}^{n} \mu_i(\Delta \Delta^\dagger) \,\dot{\geq}\, \rho^{n - r} \tag{11}$$

for all codeword difference matrices $\Delta = X_1 - X_2$, where
$X_1, X_2 \in \mathcal{X}$ and $X_1 \neq X_2$. It is known that
approximate universality is a sufficient condition for DMT
optimality for any fading statistics, assuming ML decoding [14]. For
approximately universal codes we have the following lemma
that follows directly from [14, Equation (21)]$^2$.

Lemma 1: Let $\Delta = X_1 - X_2$ for $X_1, X_2 \in \mathcal{X}$,
$X_1 \neq X_2$. If $\mathcal{X}$ is approximately universal over the
$n_T \times n_R$ channel and $H \in \mathcal{O}_\epsilon$, it follows
that $\|H\Delta\|_F^2 \,\dot{\geq}\, \rho^{\epsilon/n}$.

For approximately universal lattice ST codes we may also
give the following corollary to Lemma 1. The proof is given
in the appendix.

Corollary 2: Let $\mathcal{X}$ be a lattice ST code of the form (6)
which, for a fixed lattice generator matrix $G$, is approximately
universal for all multiplexing gains in a neighborhood of $r$.
Then, for $s_1, s_2 \in \mathcal{S}_{r+\zeta}$, $s_1 \neq s_2$, and
$H \in \mathcal{O}_\epsilon$ it holds that

$$\|F(s_1 - s_2)\|^2 \,\dot{\geq}\, \rho^{\frac{\epsilon - \zeta}{n} + \frac{\zeta T}{\kappa}} \tag{12}$$

for sufficiently small $\zeta$, $0 < \zeta < \epsilon$.



2 In relation to the result presented here, we point out a small typo in
equations (20) and (21) in [14], wherein
$2^{R(1+\epsilon)} (|\lambda_1| \cdots |\lambda_{n_m}|)^{2/n_m}$
should be replaced by
$(2^{R(1+\epsilon)} |\lambda_1|^2 \cdots |\lambda_{n_m}|^2)^{1/n_m}$,
c.f. (17) in the same paper. Note also the slightly different definition of
$\mathcal{O}_\epsilon$ in our paper and in [14], where in the definition of
$\mathcal{O}_\epsilon$ we use $(r + \epsilon)$ in place of $r(1 + \epsilon)$.



Approximately universal lattice ST codes, which satisfy
the conditions of Corollary 2, are known to exist for any
$(n_R, n_T)$-tuple and multiplexing gain $r$, see e.g. [11]. The
codes in [11] are in fact, for a fixed $G$, approximately universal
over all $r \in [0, n]$. In what follows we only consider codes
for which Corollary 2 applies. We also point out that in
the definition of approximately universal lattice ST codes we
require that the set of data symbols $\mathcal{S}_r$ is given by the
Gaussian integers within a hyper-sphere of radius
$\rho^{rT/(2\kappa)}$. It is readily seen that the presented analysis
carries over (at the expense of extra notational complexity) to the
more practical case where the constellation is cubic, i.e.
$|\Re(s_k)|, |\Im(s_k)| \leq \rho^{rT/(2\kappa)}$, which also
maintains the scheme's multiplexing gain.

III. Regularized lattice decoding 

As noted in Section II, ML decoding is equivalent to solving
(10). The naive lattice decoder (c.f. [10]) is obtained by simply
ignoring the constellation boundary of
$\mathcal{S}_r \subset \mathbb{Z}_G^\kappa$:

$$\hat{s} = \arg\min_{\hat{s} \in \mathbb{Z}_G^\kappa} \|y - F\hat{s}\|^2. \tag{13}$$

We count the event when the decoder decides in favor of a
codeword not in the constellation as an error. The benefit of
using (13) in place of (10) is that one may avoid the potentially
complicated boundary control, and apply tools from lattice
reduction theory for solving (13). However, as argued in [10]
and subsequently proved in [8], the naive lattice decoder is
not in general DMT optimal. It was however also shown in
[10] that the problem is not with lattice coding and decoding
per se, but rather with the naive implementation.

Intuitively, as the ML decoder (c.f. (10)) is DMT optimal for
approximately universal codes and the naive lattice decoder is
not, the sub-optimality of the naive lattice decoder must stem
from the fact that it decides, with high probability, in favor of
codewords that do not belong to the constellation $\mathcal{S}_r$.
Note here that $\hat{s} \notin \mathcal{S}_r$,
$\hat{s} \in \mathbb{Z}_G^\kappa$, implies
$\|\hat{s}\|^2 > \rho^{rT/\kappa}$. Thus, by having the decoder
penalize vectors $\hat{s}$ with large norm, one can expect
to reduce the probability of out-of-constellation errors. This
amounts to regularization of the decoding metric, and we let
the $\alpha$-regularized lattice decoder be given by

$$\hat{s}_\alpha = \arg\min_{\hat{s} \in \mathbb{Z}_G^\kappa} \|y - F\hat{s}\|^2 + \alpha \|\hat{s}\|^2. \tag{14}$$
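
To make the effect of the penalty term in (14) tangible, the following toy sketch evaluates the regularized metric by exhaustive search. It is not an efficient decoder: the search is restricted to a finite box of Gaussian integers (the `bound` parameter is an assumption made here so that the search terminates), and all function names are hypothetical.

```python
from itertools import product


def gaussian_integer_box(kappa, bound):
    """Vectors in Z_G^kappa whose real and imaginary parts lie in [-bound, bound]."""
    axis = [complex(a, b)
            for a in range(-bound, bound + 1)
            for b in range(-bound, bound + 1)]
    return product(axis, repeat=kappa)


def regularized_metric(y, F, s, alpha):
    """||y - F s||^2 + alpha * ||s||^2 for F given as a list of rows."""
    residual = 0.0
    for row, y_i in zip(F, y):
        err = y_i - sum(f * s_j for f, s_j in zip(row, s))
        residual += abs(err) ** 2
    return residual + alpha * sum(abs(s_j) ** 2 for s_j in s)


def regularized_lattice_decode(y, F, alpha, bound):
    """Brute-force minimizer of the metric in (14) over a finite search box."""
    kappa = len(F[0])
    return min(gaussian_integer_box(kappa, bound),
               key=lambda s: regularized_metric(y, F, s, alpha))


if __name__ == "__main__":
    F = [[1, 0], [0, 0.1]]   # ill-conditioned toy channel
    y = [1, 0.16]            # F @ (1, 0) plus a small noise term
    print(regularized_lattice_decode(y, F, alpha=0.0, bound=2))   # naive decoder
    print(regularized_lattice_decode(y, F, alpha=0.05, bound=2))  # regularized
```

In this toy example the transmitted vector is $(1, 0)$. With $\alpha = 0$ the weak second channel direction lets a small noise term pull the decision to a larger-norm vector, whereas a modest penalty restores the correct decision.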



Clearly, for $\alpha = 0$ the regularized lattice decoder coincides
with the naive lattice decoder. We will however in what follows
show that by choosing $\alpha$ appropriately, one can achieve DMT
optimality for any approximately universal code. The result is
captured by the following theorem.

Theorem 3: Approximately universal lattice codes, decoded
using the $\alpha$-regularized decoder with
$\alpha = \rho^{-rT/\kappa}$, achieve DMT optimality and do so
irrespective of the fading statistics.

Proof: We will show that when $s$ is the data vector corresponding
to the transmitted codeword of an approximately universal code,
when $\alpha = \rho^{-rT/\kappa}$ and when $\epsilon > 0$, using the
$\alpha$-regularized decoder in (14) implies that
$P(\hat{s}_\alpha \neq s \mid H \in \mathcal{O}_\epsilon) \doteq \rho^{-\infty}$.
In other words, the conditional probability of error vanishes
exponentially fast for channels (strictly) not in outage,
establishing DMT optimality.

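Towards this end, given $\epsilon > 0$, choose $\zeta$ and $\delta$
such that $0 < \zeta < \epsilon$, where $\zeta$ is sufficiently
small for Corollary 2 to apply, and such that (c.f. (12))

$$\frac{\epsilon - \zeta}{n} + \frac{\zeta T}{\kappa} > \delta > 0 \quad \text{and} \quad \frac{\zeta T}{\kappa} > \delta > 0. \tag{15}$$

This can always be done. Assume also that
$H \in \mathcal{O}_\epsilon$ and that the noise vector $w$
satisfies $\|w\|^2 \leq \rho^\delta$.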

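Consider first the $\alpha$-regularized metric (c.f. (14)) for the
transmitted data vector $s \in \mathcal{S}_r$. As (c.f. (8))

$$\|y - Fs\|^2 = \|w\|^2$$

it follows that

$$\|y - Fs\|^2 + \alpha \|s\|^2 \leq \rho^\delta + \alpha \rho^{rT/\kappa} \doteq \rho^\delta \tag{16}$$

where we used that $\|w\|^2 \leq \rho^\delta$, that
$\alpha = \rho^{-rT/\kappa}$ and that $s \in \mathcal{S}_r$, which
implies $\|s\|^2 \leq \rho^{rT/\kappa}$ (c.f. (7)).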

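For any data vector $\hat{s} \in \mathcal{S}_{r+\zeta}$,
$\hat{s} \neq s$, we note that

$$\|y - F\hat{s}\| = \|F(s - \hat{s}) + w\| \geq \|F(s - \hat{s})\| - \|w\|.$$

As $\|F(s - \hat{s})\| \,\dot{\geq}\, \rho^{\frac{1}{2}\left(\frac{\epsilon - \zeta}{n} + \frac{\zeta T}{\kappa}\right)}$
by Corollary 2, and as $\|w\| \leq \rho^{\delta/2}$, it follows by
(15) that

$$\|y - F\hat{s}\|^2 \,\dot{\geq}\, \rho^{\frac{\epsilon - \zeta}{n} + \frac{\zeta T}{\kappa}}$$

and

$$\|y - F\hat{s}\|^2 + \alpha \|\hat{s}\|^2 \,\dot{\geq}\, \rho^{\frac{\epsilon - \zeta}{n} + \frac{\zeta T}{\kappa}} \tag{17}$$

for any $\hat{s} \in \mathcal{S}_{r+\zeta}$, $\hat{s} \neq s$.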

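For $\hat{s} \notin \mathcal{S}_{r+\zeta}$,
$\hat{s} \in \mathbb{Z}_G^\kappa$, it holds that
$\|\hat{s}\|^2 > \rho^{(r+\zeta)T/\kappa}$ (c.f. (7)), by which it
follows that

$$\|y - F\hat{s}\|^2 + \alpha \|\hat{s}\|^2 > \alpha \rho^{(r+\zeta)T/\kappa} = \rho^{\zeta T/\kappa}. \tag{18}$$

By defining

$$\xi \triangleq \min\left( \frac{\epsilon - \zeta}{n} + \frac{\zeta T}{\kappa}, \; \frac{\zeta T}{\kappa} \right),$$

where $\xi > \delta$ due to (15), and combining (17) and (18), it
follows that

$$\|y - F\hat{s}\|^2 + \alpha \|\hat{s}\|^2 \,\dot{\geq}\, \rho^\xi \tag{19}$$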



for any $\hat{s} \in \mathbb{Z}_G^\kappa$, $\hat{s} \neq s$. As
$\delta < \xi$ (i.e. $\delta$ is strictly smaller than $\xi$), it
follows by (16) and (19) that there is $\rho_0$ such that

$$\|y - Fs\|^2 + \alpha \|s\|^2 < \|y - F\hat{s}\|^2 + \alpha \|\hat{s}\|^2$$

for any $\hat{s} \in \mathbb{Z}_G^\kappa$, $\hat{s} \neq s$, and
$\rho \geq \rho_0$. This implies that the $\alpha$-regularized
decoder will make a correct decision. In other words, if
$H \in \mathcal{O}_\epsilon$, it follows that
$\|w\|^2 > \rho^\delta$ constitutes a necessary condition for an
error to occur when $\rho \geq \rho_0$. However, as
$P(\|w\|^2 > \rho^\delta) \doteq \rho^{-\infty}$ due to the
exponential tails of the Gaussian distribution, we see that
$P(\hat{s}_\alpha \neq s \mid H \in \mathcal{O}_\epsilon) \doteq \rho^{-\infty}$
and the claim of Theorem 3 follows. □
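
The last step of the proof relies on $P(\|w\|^2 > \rho^\delta) \doteq \rho^{-\infty}$. A quick numerical check of this behaviour is sketched below, under the assumption that $w$ has $m$ i.i.d. unit-variance circularly symmetric complex Gaussian entries, so that $\|w\|^2$ follows a Gamma (Erlang) distribution with shape $m$ and unit scale. The function names and the value $m = 4$ are illustrative only.

```python
import math


def log_tail_prob(m, x):
    """log P(||w||^2 > x) for w with m i.i.d. CN(0, 1) entries.

    ||w||^2 is Erlang(m, 1), whose survival function is
    exp(-x) * sum_{k < m} x^k / k!.
    """
    partial = sum(x ** k / math.factorial(k) for k in range(m))
    return -x + math.log(partial)


def tail_exponent(m, rho, delta):
    """Finite-rho estimate of log P(||w||^2 > rho^delta) / log rho."""
    return log_tail_prob(m, rho ** delta) / math.log(rho)


if __name__ == "__main__":
    for rho in (1e2, 1e4, 1e6):
        print(rho, tail_exponent(m=4, rho=rho, delta=0.5))
```

The estimated exponent keeps decreasing as $\rho$ grows rather than settling at a finite value, which is what the $\rho^{-\infty}$ notation expresses.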
The metric in (14) is not identical to the metric used in the
MMSE-GDFE decoder considered in [10], although the two
metrics share some key features. In particular, if the lattice
translate is omitted, it can be shown that the metric in [10] is
equivalent to (c.f. (14))

$$\|y - F\hat{s}\|^2 + \rho^{-1} \|\theta G \hat{s}\|^2, \tag{20}$$

i.e. the regularization is applied to the vectorized codeword
$x = \theta G \hat{s}$ instead of $\hat{s}$. It is a straightforward
exercise to repeat the proof of Theorem 3 and show that decoding
with respect to (20) is also DMT optimal. To this end, note that
$\theta^2 \rho^{-1} \doteq \rho^{-rT/\kappa}$. In fact, when $G$ is an
orthogonal matrix, as is the case for perfect codes [12], (20)
reduces to (14). This confirms the observation made in [10] that
the "magic" ingredient of the GDFE-MMSE decoder, in terms of DMT
optimality, is MMSE pre-processing. Similarly, it reveals that
$\alpha = \rho^{-rT/\kappa}$ is the corresponding "magic" parameter
for the regularized lattice decoder, which motivates us to refer to
the regularized lattice decoder with $\alpha = \rho^{-rT/\kappa}$ as
the MMSE regularized lattice decoder. It should however be noted
that the choice of $\alpha = \rho^{-rT/\kappa}$ can naturally also be
directly obtained from the linear MMSE filter for $s$ given the
observation $y$ (c.f. (8) and Section IV).

IV. Lattice reduction aided decoding 

By "completing the squares", the $\alpha$-regularized metric may
equivalently be written as

$$\|y - F\hat{s}\|^2 + \alpha \|\hat{s}\|^2 = \|z - R\hat{s}\|^2 + c \tag{21}$$

where $R \in \mathbb{C}^{\kappa \times \kappa}$ is a square root
factor of $F^\dagger F + \alpha I$, i.e.

$$R^\dagger R = F^\dagger F + \alpha I, \tag{22}$$

where $z = R^{-\dagger} F^\dagger y$, and where

$$c = y^\dagger \left[ I - F (F^\dagger F + \alpha I)^{-1} F^\dagger \right] y \geq 0. \tag{23}$$

The $\alpha$-regularized decoder can thus be expressed as

$$\hat{s}_\alpha = \arg\min_{\hat{s} \in \mathbb{Z}_G^\kappa} \|z - R\hat{s}\|^2. \tag{24}$$

The optimization problem in (24), however, still requires the
solution to a closest vector problem (CVP), which is NP-hard
in general. This makes sub-optimal solutions appealing. To
this end, consider the decoder given by

$$\hat{s}_{\alpha,\mathrm{MMSE}} = \arg\min_{\hat{s} \in \mathbb{Z}_G^\kappa} \|R^{-1} z - \hat{s}\|^2. \tag{25}$$

The decoder in (25) is easily implemented by component-wise
rounding of $R^{-1} z$ to the nearest Gaussian integer vector. It is
relatively straightforward to verify that

$$R^{-1} z = (F^\dagger F + \alpha I)^{-1} F^\dagger y$$

which implies that the solution to (25) corresponds to the
standard linear MMSE decoder.
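
The rounding decoder in (25) can be sketched in a few lines by solving $(F^\dagger F + \alpha I) x = F^\dagger y$ and rounding each entry of $x$. The helper names below are hypothetical, the small Gauss-Jordan solver merely stands in for whatever linear algebra routine one would normally use, and Python's built-in `round` resolves exact ties to the nearest even integer.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * p for x, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def round_gaussian(v):
    """Nearest Gaussian integer to v (real and imaginary parts rounded separately)."""
    v = complex(v)
    return complex(round(v.real), round(v.imag))


def linear_mmse_decode(y, F, alpha):
    """Component-wise rounding of (F^H F + alpha I)^{-1} F^H y, cf. (25)."""
    rows, kappa = len(F), len(F[0])
    gram = [[sum(F[i][a].conjugate() * F[i][b] for i in range(rows))
             + (alpha if a == b else 0)
             for b in range(kappa)] for a in range(kappa)]
    matched = [sum(F[i][a].conjugate() * y[i] for i in range(rows))
               for a in range(kappa)]
    return [round_gaussian(v) for v in solve(gram, matched)]


if __name__ == "__main__":
    F = [[1, 0], [0, 0.1]]
    y = [1, 0.16]
    print(linear_mmse_decode(y, F, alpha=0.0))   # zero-forcing style rounding
    print(linear_mmse_decode(y, F, alpha=0.05))  # MMSE-regularized rounding
```

With $\alpha = 0$ the estimate reduces to zero-forcing followed by rounding, so the same routine covers both linear receivers.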

Yao and Wornell [4] suggested the use of lattice reduction
to improve the approximation quality when replacing (24) by
(25). The key idea behind this approach is to note that (24) is
equivalent to

$$\min_{\tilde{s} \in \mathbb{Z}_G^\kappa} \|z - RT\tilde{s}\|^2 \tag{26}$$

where $T$ is a unimodular matrix, i.e. $T$ is a one-to-one map
from $\mathbb{Z}_G^\kappa$ to $\mathbb{Z}_G^\kappa$ or equivalently,
$T \in \mathbb{Z}_G^{\kappa \times \kappa}$ and $|\det(T)| = 1$.
We write $\tilde{R} = RT$ in what follows, and refer to $\tilde{R}$
as the lattice reduced channel. The process of finding $T$, given
$R$, is known as lattice reduction.

3 Note also that the metric in [10] is expressed in a real valued form which
allows for more general code designs. The real valued reformulation will be
considered in a journal version of this work.

The sub-optimal solution corresponding to (26) is given by

$$\tilde{s}_{\alpha,\mathrm{LR\text{-}MMSE}} = \arg\min_{\tilde{s} \in \mathbb{Z}_G^\kappa} \|\tilde{R}^{-1} z - \tilde{s}\|^2 \tag{27}$$

where $\tilde{R} = RT$, and the approximate solution to (24) is

$$\hat{s}_{\alpha,\mathrm{LR\text{-}MMSE}} = T \tilde{s}_{\alpha,\mathrm{LR\text{-}MMSE}}. \tag{28}$$

The key observation of [4] is that by making $\tilde{R}$ well
conditioned (by the appropriate choice of $T$), the quality of the
approximation may be significantly improved. The resulting
decoder (defined by (27) and (28)) is known as the LR-aided
MMSE decoder [5].
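
The two steps (27) and (28) can be sketched as follows, assuming a unimodular $T$ is already available. The code does not perform lattice reduction itself: in the toy example, $T$ was picked by hand so that the columns of $\tilde{R} = RT$ are orthogonal, standing in for the output of a reduction algorithm. All names are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * p for x, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def round_gaussian(v):
    """Nearest Gaussian integer to v."""
    v = complex(v)
    return complex(round(v.real), round(v.imag))


def lr_aided_round(R, z, T):
    """Round in the reduced basis R T, cf. (27), then map back with T, cf. (28)."""
    kappa = len(T)
    R_red = [[sum(R[i][k] * T[k][j] for k in range(kappa)) for j in range(kappa)]
             for i in range(len(R))]
    s_tilde = [round_gaussian(v) for v in solve(R_red, z)]
    return [sum(T[i][j] * s_tilde[j] for j in range(kappa)) for i in range(kappa)]


if __name__ == "__main__":
    R = [[1, 0.9], [0, 0.1]]      # nearly parallel columns
    z = [1.12, -0.16]             # R @ (2, -1) plus noise (0.02, -0.06)
    identity = [[1, 0], [0, 1]]
    T = [[-1, -4], [1, 5]]        # hand-picked, |det T| = 1; R T has orthogonal columns
    print(lr_aided_round(R, z, identity))  # plain rounding, no reduction
    print(lr_aided_round(R, z, T))         # rounding in the reduced basis
```

In this example plain rounding is thrown off by the poorly conditioned $R$, while rounding in the reduced basis recovers the vector $(2, -1)$ used to generate $z$.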

The most commonly considered lattice reduction algorithm 
is the computationally efficient LLL algorithm [15]. The LLL 
algorithm is also known to provide maximum receive diversity, 
at multiplexing gain $r = 0$ and under i.i.d. Rayleigh fading,
for uncoded V-BLAST transmissions [6]. In what follows we 
prove that LLL based LR-aided decoding can in fact achieve 
the most general diversity-related optimality, by showing that 
the LLL based LR-aided MMSE decoder can, in the context 
of lattice codes, achieve the maximal diversity gain for all 
multiplexing gains r and fading statistics. 

Theorem 4: Approximately universal lattice codes, when 
decoded using the LLL based LR-aided MMSE decoder, 
achieve the optimal DMT tradeoff, and do so irrespective of 
fading statistics. 

Proof: To prove the above, we will demonstrate that
$P(\hat{s}_{\alpha,\mathrm{LR\text{-}MMSE}} \neq s \mid H \in \mathcal{O}_\epsilon) \doteq \rho^{-\infty}$.
To this end, let $\tilde{R} = RT$ be the LLL lattice reduced channel
matrix. It follows by the bounded orthogonality defect of LLL
reduced bases (c.f. [15] and the proof in [6]) that there is a
constant $K_\kappa > 0$, independent of $R$, for which

$$\sigma_{\max}(\tilde{R}^{-1}) \leq \frac{K_\kappa}{\lambda(R)} \tag{29}$$

where $\sigma_{\max}(\tilde{R}^{-1})$ is the largest singular value
of $\tilde{R}^{-1}$ and where

$$\lambda(R) = \min_{c \in \mathbb{Z}_G^\kappa \setminus \{0\}} \|Rc\| \tag{30}$$

denotes the length of the shortest nonzero vector in the lattice
generated by $R$. Although the proof in [6] was given for real
valued bases, it straightforwardly extends to the complex case,
c.f. [16].

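Assume, as in the proof of Theorem 3, that
$H \in \mathcal{O}_\epsilon$ and $\|w\|^2 \leq \rho^\delta$. For
$\hat{s} \in \mathbb{Z}_G^\kappa$, $\hat{s} \neq s$, it follows that

$$\|z - R\hat{s}\| = \|(z - Rs) + R(s - \hat{s})\| \leq \|R(s - \hat{s})\| + \|z - Rs\|$$

and

$$\|R(s - \hat{s})\| \geq \|z - R\hat{s}\| - \|z - Rs\| \,\dot{\geq}\, (\rho^\xi - c)^{1/2} - \|z - Rs\| \tag{31}$$

where the last inequality follows by combining (19) and (21).
As $c \,\dot{\leq}\, \rho^\delta$ and
$\|z - Rs\|^2 \,\dot{\leq}\, \rho^\delta$ by (16) and (21), and since
$\xi > \delta$, we may conclude from (31) that
$\|R(s - \hat{s})\|^2 \,\dot{\geq}\, \rho^\xi$ for any
$\hat{s} \in \mathbb{Z}_G^\kappa$, $\hat{s} \neq s$. By identifying
$c = s - \hat{s} \in \mathbb{Z}_G^\kappa \setminus \{0\}$ in (30) it
follows that $\lambda^2(R) \,\dot{\geq}\, \rho^\xi$ and by (29) that

$$\sigma_{\max}^2(\tilde{R}^{-1}) \,\dot{\leq}\, \rho^{-\xi}. \tag{32}$$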

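From (27) and (28) it may be seen that
$\hat{s}_{\alpha,\mathrm{LR\text{-}MMSE}} \neq s$ if and only if
$\tilde{s}_{\alpha,\mathrm{LR\text{-}MMSE}} \neq \tilde{s}$, where
$\tilde{s} = T^{-1} s$. The metric in (27), evaluated for
$\tilde{s} = T^{-1} s$, satisfies

$$\|\tilde{R}^{-1} z - \tilde{s}\|^2 = \|\tilde{R}^{-1}(z - \tilde{R}\tilde{s})\|^2 \leq \sigma_{\max}^2(\tilde{R}^{-1}) \|z - Rs\|^2 \,\dot{\leq}\, \rho^{\delta - \xi} \tag{33}$$

where the last inequality follows by (32) together with
$\|z - Rs\|^2 \,\dot{\leq}\, \rho^\delta$ and
$\tilde{R}\tilde{s} = Rs$. For $\hat{s} \in \mathbb{Z}_G^\kappa$,
$\hat{s} \neq \tilde{s}$, it follows that

$$\|\tilde{R}^{-1} z - \hat{s}\| = \|\tilde{R}^{-1} z - \tilde{s} + (\tilde{s} - \hat{s})\| \geq \|\tilde{s} - \hat{s}\| - \|\tilde{R}^{-1} z - \tilde{s}\|.$$

By noting that $\|\tilde{s} - \hat{s}\|^2 \geq 1$ if
$\hat{s} \neq \tilde{s}$, that
$\|\tilde{R}^{-1} z - \tilde{s}\|^2 \,\dot{\leq}\, \rho^{\delta - \xi}$
(c.f. (33)) and that $\delta - \xi < 0$, it follows that

$$\|\tilde{R}^{-1} z - \hat{s}\|^2 \,\dot{\geq}\, \rho^0 \tag{34}$$

for $\hat{s} \in \mathbb{Z}_G^\kappa$, $\hat{s} \neq \tilde{s}$.
Combining (33) and (34) yields
$\|\tilde{R}^{-1} z - \tilde{s}\|^2 < \|\tilde{R}^{-1} z - \hat{s}\|^2$
for all $\hat{s} \in \mathbb{Z}_G^\kappa$, $\hat{s} \neq \tilde{s}$,
and sufficiently large $\rho$, implying that the decision of the
LR-aided MMSE decoder (c.f. (27) and (28)) is correct. As in
the proof of Theorem 3, we see that given
$H \in \mathcal{O}_\epsilon$ it must hold that
$\|w\|^2 > \rho^\delta$ for an error to occur, which implies
$P(\hat{s}_{\alpha,\mathrm{LR\text{-}MMSE}} \neq s \mid H \in \mathcal{O}_\epsilon) \doteq \rho^{-\infty}$. □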

V. Conclusion 

In this paper, we consider the problem of efficiently de- 
coding approximately universal lattice ST codes. We show 
that MMSE regularized lattice decoding in general, and the 
computationally efficient LLL based LR-aided MMSE decoder 
in particular, realize the maximum receive diversity and thus 
DMT optimality for approximately universal lattice codes. The 
result holds for any fading statistics and confirms that the 
key to achieving DMT optimality is the regularization of the 
decoding metric provided by the MMSE decoder. 

Appendix 

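Proof of Corollary 2: By the equivalent channel model (c.f.
(6), (8) and (9)) and Lemma 1 it follows that

$$\|H(X_1 - X_2)\|_F^2 = \|F(s_1 - s_2)\|^2 \,\dot{\geq}\, \rho^{\epsilon/n}$$

for $s_1, s_2 \in \mathcal{S}_r$, $s_1 \neq s_2$, given that
$H \in \mathcal{O}_\epsilon$. For the un-normalized equivalent
channel $\bar{F} = (I_T \otimes H) G$ we have

$$\theta^2 \|\bar{F}(s_1 - s_2)\|^2 \doteq \rho^{1 - rT/\kappa} \|\bar{F}(s_1 - s_2)\|^2 \,\dot{\geq}\, \rho^{\epsilon/n},$$

where $\bar{F}$ is independent of $\rho$ and $r$ (note that
$\bar{F} = \theta^{-1} F$). Consider now the application of Lemma 1
to a scheme with multiplexing gain $r' = r + \zeta$, where
$0 < \zeta < \epsilon$. By the assumption that
$H \in \mathcal{O}_\epsilon$ it follows that

$$\log\det(I + \rho H H^\dagger) > (r + \epsilon) \log \rho = (r' + \epsilon - \zeta) \log \rho$$

which by the application of Lemma 1 implies that

$$\rho^{1 - r'T/\kappa} \|\bar{F}(s_1 - s_2)\|^2 \,\dot{\geq}\, \rho^{\frac{\epsilon - \zeta}{n}} \tag{35}$$

for $s_1, s_2 \in \mathcal{S}_{r'}$, $s_1 \neq s_2$. Rewriting (35)
in terms of $r$ yields
$\rho^{1 - rT/\kappa} \|\bar{F}(s_1 - s_2)\|^2 \,\dot{\geq}\, \rho^{\frac{\epsilon - \zeta}{n} + \frac{\zeta T}{\kappa}}$
or equivalently
$\|F(s_1 - s_2)\|^2 \,\dot{\geq}\, \rho^{\frac{\epsilon - \zeta}{n} + \frac{\zeta T}{\kappa}}$
for any $s_1, s_2 \in \mathcal{S}_{r'} = \mathcal{S}_{r+\zeta}$,
$s_1 \neq s_2$. □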